We want to explore how in-person versus mail-in voting relates to the development of Covid-19 cases.
The overall pandemic situation before the voting period started might have led people to prefer mail-in voting. This is partly supported by the fact that people unable to vote in person could register for a mail ballot and give “Covid” as the excuse.
To understand this, we plot the percentage of mail-in voting in elections since 2000 to see whether there is evidence of a growing preference for mail-in voting over other voting modes.
Background knowledge suggests that the Trump administration and its supporters opposed mail-in voting, so we also look at how the preference for mail-in voting changed among Democrats and Republicans respectively.
There is a sharp increase in the overall mail-in voting percentage from 2016 to 2020, which could well be due to the Covid-19 pandemic.
There is also a sharp increase from 2016 to 2020 within each party. In particular, Democrats favor mail-in voting much more than Republicans.
Data Source: https://dataverse.harvard.edu/dataverse/SPAE
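The trend described above boils down to a mail-in share per election year and party. A minimal sketch in base R, using a small made-up respondent table; the column names `year`, `party` and `mode` are assumptions rather than the actual SPAE variable names, and survey weights are ignored:

```r
# Hypothetical respondent-level data; the real SPAE variable names differ.
votes <- data.frame(
  year  = c(2016, 2016, 2016, 2016, 2020, 2020, 2020, 2020),
  party = c("D", "D", "R", "R", "D", "D", "R", "R"),
  mode  = c("mail", "in_person", "in_person", "in_person",
            "mail", "mail", "mail", "in_person")
)

# Share of respondents who voted by mail, by year and party (unweighted)
votes$is_mail <- votes$mode == "mail"
mail_share <- aggregate(is_mail ~ year + party, data = votes, FUN = mean)
mail_share$pct_mail <- 100 * mail_share$is_mail

print(mail_share[order(mail_share$party, mail_share$year),
                 c("year", "party", "pct_mail")])
```

Plotting `pct_mail` against `year`, with one line per party, would give the kind of trend chart discussed here.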
We then explore whether different voting mechanisms could have an impact on the number of cases. How does the percentage of mail-in voting affect the increase in Covid-19 cases? Do Democratic and Republican states differ in the increase in Covid-19 cases over the whole voting period, given that they might have different policies and preferences regarding in-person versus mail-in voting at the state government level?
In the leaflet map, polygons are shaded to reflect the percentage of mail-in voting. Each state is categorized as a typical Democratic or Republican state based on the percentages of Democrats and Republicans in the 2020 SPAE; this is shown by the color of the state’s border on the map.
Data Source: Stewart, Charles, 2021, “2020 Survey of the Performance of American Elections”, https://doi.org/10.7910/DVN/FSGX7Z, Harvard Dataverse, V1, UNF:6:70KW4uouuTDT860MiPJq3A== [fileUNF]
In the leaflet map, polygons are shaded to reflect the percentage increase in Covid cases from the start of the voting period until 7 days after election day. We have only kept data from this period. Election day is November 3, and the earliest voting date is 46 days before election day (reference: Early Voting Calendar). Our observation period ends 7 days after election day. Each state is categorized as a typical Democratic or Republican state based on the percentages of Democrats and Republicans in the 2020 SPAE; this is shown by the color of the state’s border on the map.
Data Source for Covid cases: United States CDC
Data Source for population: United States Census Bureau
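The observation window and the percentage increase in cases can be sketched as follows. The dates follow the description above (46 days before and 7 days after November 3, 2020); the two states and their cumulative case counts are made up for illustration, and the column names are assumptions:

```r
election_day <- as.Date("2020-11-03")
window_start <- election_day - 46   # earliest early-voting date
window_end   <- election_day + 7    # end of the observation period

# Hypothetical cumulative case counts for two made-up states
cases <- data.frame(
  state = rep(c("A", "B"), each = 3),
  date  = rep(as.Date(c("2020-09-01", "2020-09-18", "2020-11-10")), times = 2),
  cumulative_cases = c(900, 1000, 1500, 1900, 2000, 2400)
)

# Keep only observations inside the voting period, ordered by state and date
in_window <- cases[cases$date >= window_start & cases$date <= window_end, ]
in_window <- in_window[order(in_window$state, in_window$date), ]

# Percent increase from the first to the last observation in the window
pct_increase <- tapply(
  in_window$cumulative_cases, in_window$state,
  function(x) 100 * (x[length(x)] - x[1]) / x[1]
)
print(window_start)
print(pct_increase)
```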
We run a regression to examine the relationship between the mail-in voting percentage and weighted Covid cases, and visualize the relationship.
## [1] "Regression Result:"
##
## Call:
## lm(formula = log(statetotalw) ~ mail, data = alldata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.7131 -0.1719 0.1612 0.3351 0.7315
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 14.2730 0.1571 90.855 <2e-16 ***
## mail -0.6944 0.3009 -2.308 0.0253 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5118 on 49 degrees of freedom
## Multiple R-squared: 0.09802, Adjusted R-squared: 0.07961
## F-statistic: 5.325 on 1 and 49 DF, p-value: 0.02529
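Because the outcome is log-transformed, the coefficient of -0.6944 has to be exponentiated before it can be read as a percent change. Assuming the mail variable is measured as a proportion between 0 and 1, a 1 percentage point increase in the mail-in voting proportion is associated, on average, with roughly a 0.69 percent decrease in weighted Covid cases in a state; moving from 0 to 1 would correspond to about a 50 percent decrease, since exp(-0.6944) ≈ 0.50. No other factors are taken into account. The association is statistically significant at the 5 percent level (p = 0.025), but mail-in voting explains only about 10 percent of the variation (R-squared = 0.098).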
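In a log-linear model, a coefficient translates into a percent change through exp(beta * delta) - 1. A minimal sketch of that conversion, using the coefficient from the regression output above and assuming the mail variable is coded as a proportion between 0 and 1 (the output does not state its scale):

```r
# Coefficient on `mail`, copied from the regression output above
beta_mail <- -0.6944

# Percent change in the outcome when the regressor changes by `delta`
# in a model of the form log(y) ~ x
pct_change <- function(beta, delta) 100 * (exp(beta * delta) - 1)

full_range <- pct_change(beta_mail, 1)     # mail goes from 0 to 1
one_point  <- pct_change(beta_mail, 0.01)  # mail rises by one percentage point

print(round(c(full_range = full_range, one_point = one_point), 2))
```

For small changes, the result is close to 100 * beta * delta, which is why a one percentage point change gives a value near -0.69 percent.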
| Date | City | County | State | Indoors | People count |
|---|---|---|---|---|---|
| 6/20/20 | Tulsa | Tulsa | Oklahoma | yes | 6200 |
| 6/23/20 | Phoenix | Maricopa | Arizona | yes | 3000 |
| 8/17/20 | Mankato | Blue Earth | Minnesota | no | 500 |
| 8/17/20 | Oshkosh | Winnebago | Wisconsin | no | 1000 |
| 8/18/20 | Yuma | Yuma | Arizona | no | NA |
| 8/20/20 | Old Forge | Lackawanna | Pennsylvania | no | NA |
| 8/28/20 | Londonderry | Rockingham | New Hampshire | no | 1000 |
| 9/3/20 | Latrobe | Westmoreland | Pennsylvania | no | 7000 |
| 9/8/20 | Winston-Salem | Forsyth | North Carolina | no | 15000 |
| 9/10/20 | Freeland | Saginaw | Michigan | no | 10000 |
| 9/12/20 | Minden | Douglas | Nevada | no | 5000 |
| 9/13/20 | Henderson | Clark | Nevada | yes | NA |
| 9/17/20 | Mosinee | Marathon | Wisconsin | no | NA |
| 9/18/20 | Bemidji | Beltrami | Minnesota | no | NA |
| 9/19/20 | Fayetteville | Cumberland | North Carolina | no | 5600 |
| 9/21/20 | Swanton | Lucas | Ohio | no | NA |
| 9/21/20 | Vandalia | Vandalia | Ohio | no | 10000 |
| 9/22/20 | Pittsburgh | Allegheny | Pennsylvania | no | NA |
| 9/24/20 | Jacksonville | Duval | Florida | no | 15000 |
| 9/25/20 | Newport News | Newport News | Virginia | no | 700 |
| 9/26/20 | Middletown | Dauphin | Pennsylvania | no | NA |
In the graph above, after the rally on 2020-09-17, the slope of the curve from 2020-09-12 to 2020-09-19 is steeper than from 2020-09-06 to 2020-09-13. This might imply that the spread of Covid-19 in Marathon County (Mosinee), Wisconsin sped up after the rally.
In the graph above, after the rally on 2020-08-17, the slope of the curve from 2020-08-13 to 2020-08-20 is very similar to the slope before. This might imply that the rally had no obvious effect in Blue Earth County (Mankato), Minnesota.
In the graph above, after the rally on 2020-09-10, the slope of the curve from 2020-09-06 to 2020-09-13 is flatter than before. This might imply that the spread of Covid-19 slowed down after the rally in Saginaw County (Freeland), Michigan.
In the graph above, after the rally on 2020-08-17, the slope of the curve from 2020-08-13 to 2020-08-20 is flatter than before. This might imply that the spread of Covid-19 slowed down after the rally in Winnebago County (Oshkosh), Wisconsin.
In the graph above, after the rally on 2020-06-20, the slope of the curve from 2020-06-20 to 2020-06-27 is much steeper than before. This might imply that the rally sped up the spread of Covid-19 in Tulsa, Oklahoma.
In the graph above, after the rally on 2020-06-23, the slope of the curve from 2020-06-20 to 2020-06-27 is steeper than before. This might imply that the rally sped up the spread of Covid-19 in Maricopa County (Phoenix), Arizona. However, the effect is not very obvious.
In the graph above, after the rally on 2020-08-18, the slope of the curve from 2020-08-13 to 2020-08-20 is slightly steeper than before. This might imply that the spread of Covid-19 in Yuma, Arizona sped up slightly after the rally.
In the graph above, after the rally on 2020-08-20, the slope of the curve from 2020-08-19 to 2020-08-26 is steeper than from 2020-08-13 to 2020-08-20. This might imply that the spread of Covid-19 in Lackawanna County (Old Forge), Pennsylvania sped up after the rally.
In the graph above, after the rally on 2020-08-28, the slope of the curve from 2020-08-25 to 2020-09-01 is almost the same as before. This might imply that the spread of Covid-19 in Rockingham County (Londonderry), New Hampshire showed no obvious change after the rally.
In the graph above, after the rally on 2020-09-03, the slope of the curve from 2020-09-06 to 2020-09-13 is almost the same as from 2020-08-25 to 2020-09-01. This might imply that the spread of Covid-19 in Westmoreland County (Latrobe), Pennsylvania showed no obvious change after the rally.
In the graph above, after the rally on 2020-09-08, the slope of the curve from 2020-08-13 to 2020-08-20 is flatter than before. This might imply that the spread of Covid-19 slowed down after the rally in Forsyth County (Winston-Salem), North Carolina.
In the graph above, after the rally on 2020-09-12, the slope of the curve from 2020-09-12 to 2020-09-19 is steeper than from 2020-09-06 to 2020-09-13. This might imply that the spread of Covid-19 in Douglas County (Minden), Nevada sped up after the rally.
In the graph above, after the rally on 2020-09-13, the slope of the curve from 2020-09-12 to 2020-09-19 is almost the same as from 2020-09-06 to 2020-09-13. This might imply that the spread of Covid-19 in Clark County (Henderson), Nevada showed no obvious change after the rally.
In the graph above, after the rally on 2020-09-18, the slope of the curve from 2020-09-18 to 2020-09-25 is almost the same as from 2020-09-12 to 2020-09-19. This might imply that the spread of Covid-19 in Beltrami County (Bemidji), Minnesota showed no obvious change after the rally.
In the graph above, after the rally on 2020-09-19, the slope of the curve from 2020-09-18 to 2020-09-25 is almost the same as from 2020-09-12 to 2020-09-19. This might imply that the spread of Covid-19 in Cumberland County (Fayetteville), North Carolina showed no obvious change after the rally.
In the graph above, after the rally on 2020-09-21, the slope of the curve from 2020-09-18 to 2020-09-25 is almost the same as from 2020-09-12 to 2020-09-19. This might imply that the spread of Covid-19 in Lucas County (Swanton), Ohio showed no obvious change after the rally.
In the graph above, after the rally on 2020-09-21, the slope of the curve from 2020-09-18 to 2020-09-25 is almost the same as from 2020-09-12 to 2020-09-19. This might imply that the spread of Covid-19 in Vandalia, Ohio showed no obvious change after the rally.
In the graph above, after the rally on 2020-09-22, the slope of the curve from 2020-09-18 to 2020-09-25 is almost the same as from 2020-09-12 to 2020-09-19. This might imply that the spread of Covid-19 in Allegheny County (Pittsburgh), Pennsylvania showed no obvious change after the rally.
In the graph above, after the rally on 2020-09-24, the slope of the curve from 2020-09-24 to 2020-10-01 is almost the same as from 2020-09-18 to 2020-09-25. This might imply that the spread of Covid-19 in Duval County (Jacksonville), Florida showed no obvious change after the rally.
In the graph above, after the rally on 2020-09-25, the slope of the curve from 2020-09-24 to 2020-10-01 is almost the same as from 2020-09-18 to 2020-09-25. This might imply that the spread of Covid-19 in Newport News, Virginia showed no obvious change after the rally.
In the graph above, after the rally on 2020-09-26, the slope of the curve from 2020-09-24 to 2020-10-01 is almost the same as from 2020-09-18 to 2020-09-25. This might imply that the spread of Covid-19 in Dauphin County (Middletown), Pennsylvania showed no obvious change after the rally.
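The visual slope comparisons above can also be expressed numerically. The sketch below fits a least-squares line to cumulative cases in the week before and the week after a rally and compares the two slopes. The county data are made up, and the 7-day windows on either side of the rally date are an assumption; the graphs discussed above use windows that do not always line up with the rally date.

```r
# Hypothetical cumulative case counts for one county around a rally
rally_date <- as.Date("2020-09-17")
county <- data.frame(
  date = seq(as.Date("2020-09-03"), as.Date("2020-10-01"), by = "day")
)
# Made-up curve: 10 new cases per day up to the rally, 15 per day afterwards
days_after <- pmax(0, as.numeric(county$date - rally_date))
county$cumulative_cases <- 500 + 10 * seq_len(nrow(county)) + 5 * days_after

# Average new cases per day within a window, taken from a least-squares line
window_slope <- function(df, from, to) {
  w <- df[df$date >= from & df$date <= to, ]
  unname(coef(lm(cumulative_cases ~ as.numeric(date), data = w))[2])
}

slope_before <- window_slope(county, rally_date - 7, rally_date)
slope_after  <- window_slope(county, rally_date, rally_date + 7)

print(c(before = slope_before, after = slope_after))
```

A steeper slope after the rally than before corresponds to the "sped up" reading, a flatter one to "slowed down", and similar slopes to "no obvious change".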
| Date | City | State | Indoors | Covid spread after rally |
|---|---|---|---|---|
| 6/20/20 | Tulsa | Oklahoma | yes | Speed up |
| 6/23/20 | Phoenix | Arizona | yes | Speed up |
| 8/17/20 | Mankato | Minnesota | no | No effect |
| 8/17/20 | Oshkosh | Wisconsin | no | Slow down |
| 8/18/20 | Yuma | Arizona | no | Speed up |
| 8/20/20 | Old Forge | Pennsylvania | no | Speed up |
| 8/28/20 | Londonderry | New Hampshire | no | No effect |
| 9/3/20 | Latrobe | Pennsylvania | no | No effect |
| 9/8/20 | Winston-Salem | North Carolina | no | Slow down |
| 9/10/20 | Freeland | Michigan | no | Slow down |
| 9/12/20 | Minden | Nevada | no | Speed up |
| 9/13/20 | Henderson | Nevada | yes | Speed up |
| 9/17/20 | Mosinee | Wisconsin | no | Speed up |
| 9/18/20 | Bemidji | Minnesota | no | Speed up |
| 9/19/20 | Fayetteville | North Carolina | no | No effect |
| 9/21/20 | Swanton | Ohio | no | No effect |
| 9/21/20 | Vandalia | Ohio | no | No effect |
| 9/22/20 | Pittsburgh | Pennsylvania | no | Slow down |
| 9/24/20 | Jacksonville | Florida | no | No effect |
| 9/25/20 | Newport News | Virginia | no | No effect |
| 9/26/20 | Middletown | Pennsylvania | no | No effect |
From the table above, we can see that only 38.1% (8/21) of the rallies might have been followed by a faster spread of Covid-19. Thus, it is hard to conclude that rallies in general accelerated the spread of Covid-19.
From the graph above, we can see that the spread of Covid-19 sped up after all three indoor rallies. Among the outdoor rallies, however, the situation is much better: in about 72% of the cities (13 of 18), the spread stayed the same or even slowed down.
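The shares discussed above can be recomputed directly from the summary table. A minimal sketch, with the `Indoors` and outcome columns transcribed from that table in row order (the vector names are illustrative):

```r
# Transcribed from the summary table above, one entry per rally, in row order
indoors <- c("yes", "yes", rep("no", 9), "yes", rep("no", 9))
effect <- c("Speed up", "Speed up", "No effect", "Slow down", "Speed up",
            "Speed up", "No effect", "No effect", "Slow down", "Slow down",
            "Speed up", "Speed up", "Speed up", "Speed up", "No effect",
            "No effect", "No effect", "Slow down", "No effect", "No effect",
            "No effect")

# Cross-tabulate venue type against the observed change in spread
counts <- table(indoors, effect)
print(counts)

# Share of all rallies followed by a faster spread
share_speed_up <- 100 * mean(effect == "Speed up")

# Share of outdoor rallies where the spread stayed the same or slowed down
outdoor_not_faster <- 100 * mean(effect[indoors == "no"] != "Speed up")

print(round(c(share_speed_up = share_speed_up,
              outdoor_not_faster = outdoor_not_faster), 1))
```

With only three indoor rallies in the table, the indoor versus outdoor contrast rests on very few observations.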
We want to explore the sentiment of governors’ tweets about COVID-19 and the election in 2020. We also compared the sentiment scores with COVID-19 cases by state to see whether they are correlated.
In the first part, we cleaned the governors’ tweets and produced several visualizations to get an overview of them.
The barplot above shows the top 20 words in governors’ tweets. Words like Covid, mask, vaccine and virus are the most frequent, and words like test, vote, work and spread are also quite common.
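A word-frequency ranking of this kind can be sketched in base R as follows. The three example tweets and the short stop-word list are made up for illustration; the actual cleaning steps used for the governors' tweets are not shown in this report.

```r
# Hypothetical example tweets
tweets <- c("Wear a mask to slow the spread of the virus",
            "Get a covid test and vote by mail",
            "The vaccine will protect you from the virus")

# Lower-case the text and split on anything that is not a letter or digit
tokens <- unlist(strsplit(tolower(tweets), "[^a-z0-9]+"))

# Drop empty strings and a small, illustrative stop-word list
stop_words <- c("a", "to", "the", "of", "and", "by", "will", "you", "from", "get")
tokens <- tokens[tokens != "" & !(tokens %in% stop_words)]

# Count each word and keep the 20 most frequent
top_words <- head(sort(table(tokens), decreasing = TRUE), 20)
print(top_words)
```

Passing `top_words` to `barplot()` would produce a chart similar in spirit to the one described above.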
In the comparison word cloud above, the negative cloud is dominated by words like virus, spread, cases and lost, while the positive cloud contains words like protect, mask, test and vaccine.
The Republican comparison word cloud is much the same as the Democratic one; both are dominated by words like virus and outbreak.
Relationship between Average Sentiment Score of Governors’ tweets about COVID19 and Confirmed COVID Cases by State
From the leaflet map above, we can see that states like Arizona, Idaho, West Virginia, and Texas have relatively positive sentiment scores. On the other hand, states like New Mexico, South Carolina, Washington and New York have relatively negative sentiment scores.
Comparing the two leaflet maps above, we can’t see an obvious correlation between sentiment score and Covid cases.
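The visual comparison of the two maps could be backed by a rank correlation. A minimal sketch with made-up numbers; the state labels, sentiment scores and case counts below are all hypothetical, and the column names are assumptions:

```r
# Hypothetical state-level averages; not the values shown on the maps
states <- data.frame(
  state = c("S1", "S2", "S3", "S4", "S5", "S6"),
  avg_sentiment = c(0.42, -0.10, 0.05, 0.31, -0.25, 0.12),
  confirmed_cases = c(120000, 95000, 310000, 80000, 150000, 60000)
)

# Spearman's rank correlation is robust to the skewed distribution of case counts
rho <- cor(states$avg_sentiment, states$confirmed_cases, method = "spearman")
rank_test <- cor.test(states$avg_sentiment, states$confirmed_cases,
                      method = "spearman")

print(rho)
print(rank_test$p.value)
```

A coefficient close to zero with a large p-value would be consistent with the absence of an obvious correlation noted above.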
library("knitr")
knitr::opts_chunk$set(echo = TRUE, eval=TRUE, message=FALSE, warning = FALSE)
d <- read.csv("governors_twitter.csv")  # governor metadata; `twitter_handle` and `party` are used below
governor_tweets <- readRDS("governor_tweets.RDS")  # governors' tweets, including the `mentions_screen_name` column
library(igraph)
library(ggraph)
library(network)
library(ggnetwork)
library(statnet)
library(ggplot2)
library(dplyr)
library(tidyverse)
Network Analysis of governors from each state who mention Biden/Trump on Twitter
Dataset: Twitter data of state governors from June 2020 to now
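# Biden: count how often each governor mentions @JoeBiden
mention_Biden <- governor_tweets %>%
  unnest(mentions_screen_name) %>%
  filter(mentions_screen_name == "JoeBiden") %>%
  dplyr::count(screen_name, mentions_screen_name, name = "weight")
# One edge per governor; the mention count becomes the edge attribute `weight`
g1 <- graph_from_data_frame(mention_Biden, directed = TRUE)
# Look up the party of every vertex (not every edge) by Twitter handle
party <- as.character(d$party)[match(V(g1)$name, d$twitter_handle)]
party[V(g1)$name == "JoeBiden"] <- "D"  # Biden is not a governor, so he is not in `d`
# Weighted degree: total number of mentions sent or received by each node
deg <- igraph::strength(g1, mode = "all")
plot(g1,
     edge.color = "grey",
     edge.width = sqrt(E(g1)$weight + 1),  # edge width scales with the number of mentions
     vertex.color = ifelse(party == "D", "blue", "red"),
     vertex.size = sqrt(deg + 1),
     edge.arrow.size = 0.05,
     layout = layout_nicely(g1),
     main = "The network of governors who mention Biden on Twitter")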
In the graph above, node size is determined by degree centrality, node color by party (blue for Democrats, red for Republicans), and edge width by the number of mentions between two nodes. GovPritzker and GavinNewsom mentioned Biden most often on Twitter; in other words, their ties to Biden are the strongest in the network. Most governors who mentioned Biden on Twitter are Democrats, the same party as Biden.
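# Trump: count how often each governor mentions @realDonaldTrump
mention_Trump <- governor_tweets %>%
  unnest(mentions_screen_name) %>%
  filter(mentions_screen_name == "realDonaldTrump") %>%
  dplyr::count(screen_name, mentions_screen_name, name = "weight")
# One edge per governor; the mention count becomes the edge attribute `weight`
g2 <- graph_from_data_frame(mention_Trump, directed = TRUE)
# Look up the party of every vertex (not every edge) by Twitter handle
party <- as.character(d$party)[match(V(g2)$name, d$twitter_handle)]
party[V(g2)$name == "realDonaldTrump"] <- "R"  # Trump is not a governor, so he is not in `d`
# Weighted degree: total number of mentions sent or received by each node
deg <- igraph::strength(g2, mode = "all")
plot(g2,
     edge.color = "grey",
     edge.width = sqrt(E(g2)$weight + 1),  # edge width scales with the number of mentions
     vertex.color = ifelse(party == "D", "blue", "red"),
     vertex.size = sqrt(deg + 1),
     edge.arrow.size = 0.05,
     layout = layout_nicely(g2),
     main = "The network of governors who mention Trump on Twitter")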
In the graph above, node size is determined by degree centrality, node color by party (blue for Democrats, red for Republicans), and edge width by the number of mentions between two nodes. GovDunleavy, BrianKempGA and KimReynoldsIA mentioned Trump most often on Twitter; in other words, their ties to Trump are the strongest in the network. Almost all governors who mentioned Trump on Twitter are Republicans, the same party as Trump. Compared with Biden, Trump is mentioned more often by governors.